OrienTel – Arabic speech resources for the IT market
Abstract
A survey of the language resources market clearly shows that the Arabic language is still a stepchild of international R&D efforts in the field of speech recognition. OrienTel for the first time makes an effort to create speech data on a large scale. It does so by profiting from the experience of previous SpeechDat projects and from the European Commission’s policy to embrace non-EU Mediterranean and surrounding countries. The participants of OrienTel will collect Standard and Colloquial varieties of Arabic in Saudi Arabia, the UAE, Egypt, Israel + Palestine, Tunisia and Morocco, supplemented by other languages of the region. Help in creating an Arabic network of speech experts is appreciated.
Related resources
A first experience on multilingual acoustic modeling of the languages spoken in Morocco
The goal of this paper is to explore and describe the potential of multilingual acoustic models for automatic speech recognition of the languages spoken in Morocco. The basic experimental framework comes from the OrienTel project, mainly the sound inventory of the Arabic languages and the speech databases. Monolingual and multilingual automatic speech recognition systems for Modern Colloquial a...
OrienTel - Telephony Databases Across Northern Africa and the Middle East
OrienTel is a project that, over the past two and a half years, has developed speech databases and phonetic standards across Northern Africa, the Middle East and the Arabian Gulf. The project is funded by the European Commission and is coordinated by ScanSoft (Germany and Belgium). Other partners are ELDA (France), IBM (Germany), NSC (Israel), Siemens (Germany), Lucent (UK), Knowledge, the University o...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
To facilitate data entry into computers and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
OrienTel: speech-based interactive communication applications for the Mediterranean and the Middle East
In this paper, we introduce a new European project named OrienTel. The aim of OrienTel is to enable the project's participants to design and develop multilingual interactive communication services for the Mediterranean and the Middle East, ranging from Morocco in the West to the Gulf states in the East, including Turkey and Cyprus. These multilingual applications will be largely speech-based an...
OrienTel - Multilingual access to interactive communication services for the Mediterranean and the Middle East
OrienTel is a project funded within the European Commission’s IST framework that focuses on collecting linguistic data for telephony-based IT applications across the Mediterranean and the Middle East. Languages covered in this SpeechDat-based project are Cypriote Greek, Turkish, Hebrew, different varieties of Arabic, French, English and German. Within the project’s lifetime of 30 months, starti...